Designing Caption Production Rules Based on Face, Text and Motion Detections

Authors

C. Chapdelaine

M. Beaulieu

L. Gagnon

Abstract

Producing off-line captions for deaf and hearing-impaired people is a labor-intensive task that can require up to 18 hours of production per hour of film. Captions are placed manually close to the region of interest, but they must avoid masking human faces, text, or any moving objects that might be relevant to the story flow. Our goal is to use image processing techniques to shorten the off-line caption production process by automatically placing the captions on the proper consecutive frames. We implemented a computer-assisted captioning software tool that integrates detection of faces, text, and visual motion regions. Near-frontal faces are detected using a cascade of weak classifiers and tracked with a particle filter. Then, frames are scanned to perform text spotting and build a region map suitable for text recognition. Finally, motion mapping is based on the Lucas-Kanade optical flow algorithm and provides MPEG-7 motion descriptors. The combined detected items are then fed to a rule-based algorithm that determines the best caption location for the related sequences of frames. This paper focuses on the rules defined to assist human captioners and on the results of a user evaluation of this approach.
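The abstract does not list the rules themselves, so the following is only a minimal sketch of the general idea of rule-based placement, not the authors' algorithm. It assumes the detectors return axis-aligned bounding boxes in pixels, scans a coarse grid of candidate caption positions, and prefers the candidate that masks the least detected area, breaking ties by distance to the region of interest. The names `Box`, `place_caption`, `roi`, `avoid`, and `step` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x: int
    y: int
    w: int
    h: int

    def overlap_area(self, other: "Box") -> int:
        # Width and height of the intersection; non-positive means no overlap.
        dx = min(self.x + self.w, other.x + other.w) - max(self.x, other.x)
        dy = min(self.y + self.h, other.y + other.h) - max(self.y, other.y)
        return dx * dy if dx > 0 and dy > 0 else 0

    def center(self) -> tuple:
        return (self.x + self.w / 2, self.y + self.h / 2)


def place_caption(frame_w, frame_h, cap_w, cap_h, roi, avoid, step=20):
    """Pick a caption box inside the frame.

    Assumed rule order (an illustration, not taken from the paper):
    1. minimize the area of detected faces/text/motion that is masked;
    2. among equal candidates, stay as close as possible to the ROI.
    Returns None if the caption does not fit in the frame at all.
    """
    rx, ry = roi.center()
    best, best_key = None, None
    for y in range(0, frame_h - cap_h + 1, step):
        for x in range(0, frame_w - cap_w + 1, step):
            cand = Box(x, y, cap_w, cap_h)
            masked = sum(cand.overlap_area(b) for b in avoid)
            cx, cy = cand.center()
            dist = (cx - rx) ** 2 + (cy - ry) ** 2
            key = (masked, dist)
            if best_key is None or key < best_key:
                best, best_key = cand, key
    return best


# Usage: one detected face (also the ROI) and a strip of on-screen text.
face = Box(100, 100, 80, 80)
text = Box(0, 440, 640, 40)
caption = place_caption(640, 480, 200, 40, roi=face, avoid=[face, text])
print(caption)
```

A real tool would also have to keep the position stable across consecutive frames, which this single-frame sketch ignores.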


Similar resources

Designing caption production rules based on face, text, and motion detection


Improving video captioning for deaf and hearing-impaired people based on eye movement and attention overload

Deaf and hearing-impaired people capture information in video through visual content and captions. Those activities require different visual attention strategies and up to now, little is known on how caption readers balance these two visual attention demands. Understanding these strategies could suggest more efficient ways of producing captions. Eye tracking and attention overload detections ar...


PICTION: A System that Uses Captions to Label Human Faces in Newspaper Photographs

It is often the case that linguistic and pictorial information are jointly provided to communicate information. In situations where the text describes salient aspects of the picture, it is possible to use the text to direct the interpretation (i.e., labelling objects) in the accompanying picture. This paper focuses on the implementation of a multi-stage system PICTION that uses captions to iden...


AutoCAP: An Automatic Caption Generation System based on the Text Knowledge Power Series Representation Model

This paper describes Automatic Caption generation for news Articles, an experimental intelligent system that generates presentations in text based on the text knowledge power series representation model. Captions or titles are useful for users who only need information on the main topics of an article. Using current extractive summarization techniques, it is not able to generate a coheren...


Investigating the Impact of Concept-based Image Indexing on Image Retrieval Using the Google Search Engine

Purpose: The present study investigates the impact of concept-based image indexing on image retrieval via Google. Due to the importance of images, this article focuses on the features taken into account by Google in retrieving the images. Methodology: The present study is a type of applied research, and the research method used in it comes from quasi-experimental and techno...


Publication date: 2008
